AIbase

# Multimodal Visual Representation

Webssl Mae1b Full2b 224
A 1-billion-parameter Vision Transformer trained with masked-autoencoder (MAE) self-supervised learning on 2 billion web images. It learns visual representations without language supervision.
Image Classification, Transformers
facebook
36
0
RADIO B
RADIO is a vision foundation model developed by NVIDIA Research, capable of unifying visual information across different domains for various vision tasks.
Image Segmentation, Transformers
nvidia
999
3
© 2025 AIbase